Hyper Markov Non-Parametric Processes for Mixture Modeling and Model Selection

Author

  • Daniel Heinz
Abstract

Markov distributions describe multivariate data with conditional independence structures. Dawid and Lauritzen (1993) extended this idea to hyper Markov laws for prior distributions. A hyper Markov law is a distribution over Markov distributions whose marginals satisfy the same conditional independence constraints. These laws have been used for Gaussian mixtures (Escobar, 1994; Escobar and West, 1995) and contingency tables (Liu and Massam, 2006; Dobra and Massam, 2009). In this paper, we develop a family of non-parametric hyper Markov laws that we call hyper Dirichlet processes, combining the ideas of hyper Markov laws and non-parametric processes. Hyper Dirichlet processes are joint laws with Dirichlet process laws for particular marginals. We also describe a more general class of Dirichlet processes that are not hyper Markov but still have useful properties for describing graphical data. These graphical Dirichlet processes are simple Dirichlet processes with a hyper Markov base measure. This class allows a straightforward application of existing Dirichlet process theory and tools to graphical settings.

Given the widespread use of Dirichlet processes, there are many applications of this framework waiting to be explored. One broad class of applications, known as Dirichlet process mixtures, has been used for constructing mixture densities such that the underlying number of components may be determined by the data (Lo, 1984; Escobar, 1994; Escobar and West, 1995). We consider the use of the new graphical Dirichlet process in this setting, which imparts a conditional independence structure inside each component. In other words, given the component or cluster membership, the data exhibit the desired independence structure.

We discuss two applications. First, expanding on the work of Escobar and West (1995), we estimate a non-parametric mixture of Markov Gaussians using a Gibbs sampler. Second, we employ the Mode-Oriented Stochastic Search of Dobra and Massam (2009) for determining a suitable conditional independence model, focusing on contingency tables. In general, the mixing induced by a Dirichlet process does not drastically increase the complexity beyond that of a simpler Bayesian hierarchical model without mixture components. We provide a specific representation for decomposable graphs with useful algorithms for local updates.
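
The claim that a Dirichlet process mixture lets the data determine the number of components can be illustrated with the Chinese restaurant process, the standard partition prior induced by a Dirichlet process. The sketch below is only an illustration under stated assumptions: it is not the Gibbs sampler described in the abstract, it ignores the likelihood and the graphical structure inside each component, and the function name `crp_partition` and its parameters are hypothetical.

```python
import random


def crp_partition(n, alpha, seed=0):
    """Draw cluster labels for n items from a Chinese restaurant process.

    Hypothetical helper for illustration only. `alpha` is the concentration
    parameter (assumed > 0); larger values favour more clusters.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items currently in cluster k
    labels = []  # labels[i] = cluster index assigned to item i
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_partition(n=50, alpha=1.0)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is not fixed in advance; it grows with the sample size at a rate governed by `alpha`. In a full mixture model, these prior weights would be multiplied by each component's likelihood for the observation being assigned.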

Similar resources

Thesis Proposal: Non-parametric Hyper Markov Priors

Markov distributions are used to describe multivariate data with conditional independence structure. Applications of Markov distributions arise in many fields including demography, flood prediction, and telecommunications. A hyper Markov law is a distribution over the space of all Markov distributions; such laws have been used as prior distributions for various types of graphical models. Dirich...

Factorized Asymptotic Bayesian Hidden Markov Models

This paper addresses the issue of model selection for hidden Markov models (HMMs). We generalize factorized asymptotic Bayesian inference (FAB), which has been recently developed for model selection on independent hidden variables (i.e., mixture models), for time-dependent hidden variables. As with FAB in mixture models, FAB for HMMs is derived as an iterative lower bound maximization algorithm...

Non-parametric Bayesian modeling of complex networks

Modeling structure in complex networks using Bayesian non-parametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This paper provides a gentle introduction to non-parametric Bayesian modeling of complex networks: Using an infinite mixture model as running example we go through the steps of deriving the model as an infini...

A new non-parametric approach for suppliers selection

In this paper we propose a simple non-parametric model for the multiple criteria supplier selection problem. The proposed model does not generate a zero weight for a certain criterion and ranks the suppliers without solving the model n times (one linear programming (LP) problem for each supplier), and therefore allows the manager to get faster results. The methodology is illustrated using an example.

A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data

An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization to make it possible to estimate the number of speakers. For this purpose, a framework of non-parametric Bayesian modeling is implemented with the Markov chain Monte Carlo and incorporated in the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet pr...

Publication date: 2015